Add Pipeline design #21

Open
Yancey0623 wants to merge 3 commits into sql-machine-learning:develop from Yancey0623:pipeline_design

Collaborator

@Yancey0623 Yancey0623 commented Mar 23, 2020

Add API design for Tekton Pipeline.

@Yancey0623 Yancey0623 changed the title from "[wip]Add Pipeline design" to "Add Pipeline design" Mar 23, 2020
@Yancey0623
Collaborator Author

cc @typhoonzero

Comment thread doc/design.md

```diff
-skaffold_git = fluid.Git(
+skaffold_git = fluid.git_resource(
```

Please also update the description below: "Please be aware that the call to `fluid.Git` doesn't include the name".


Why not use `fluid.git` for simplicity?

Collaborator Author

> Why not use `fluid.git` for simplicity?

This just tracks the current implementation. I think we can update the API and design in another PR if needed.

Collaborator

I remember that Tekton has a limited number of pre-defined resource types, and git is one of them. I would suggest we keep `Git` rather than `git_resource`, because `git_resource` is not a full name; `git_pipeline_resource` is, but it is too long. It seems reasonable to use the short name `Git` for one of a few pre-defined types.

Comment thread doc/design.md Outdated

``` python
@fluid.pipeline
def tutorial(source_repo:"resource,git", web_image="resource,image"):
```

source_repo="resource,git"

Collaborator Author

@Yancey0623 Yancey0623 Mar 27, 2020

I think it's okay: this is a parameter annotation, not a default value. See PEP 3107: https://www.python.org/dev/peps/pep-3107/#id29
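To make the distinction concrete, here is a minimal sketch using only the standard `inspect` module. The function is a stand-in that copies the signature quoted in this thread; its body is omitted and nothing here depends on `fluid`.

```python
import inspect


def tutorial(source_repo: "resource,git", web_image="resource,image"):
    """Stand-in for the pipeline function quoted above; the body is omitted."""


sig = inspect.signature(tutorial)

# `source_repo` carries an annotation and has no default, so it is required.
source_repo = sig.parameters["source_repo"]
assert source_repo.annotation == "resource,git"
assert source_repo.default is inspect.Parameter.empty

# `web_image` has a default value and no annotation, so it is optional.
web_image = sig.parameters["web_image"]
assert web_image.annotation is inspect.Parameter.empty
assert web_image.default == "resource,image"

# Only annotated parameters show up in __annotations__.
assert tutorial.__annotations__ == {"source_repo": "resource,git"}
```

A decorator that reads resource types from annotations would therefore see `source_repo` but not `web_image` as written in the quoted signature.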

Comment thread doc/design.md
build_skaffold_web = build_docker_image_from_git_source(source_repo, web_image)

deploy_web = deploy_using_kubectl(source_repo, web_image)
deploy_web.web_image.from(build_skaffold_web)

How do we define a dependency without using inputs/outputs?

Collaborator Author

@Yancey0623 Yancey0623 Mar 27, 2020

We can use the `runAfter` keyword. I added a section, "Pipeline with DAG", that explains how to construct the DAG.
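As a rough illustration of how `runAfter` and `from` both contribute ordering edges, the sketch below derives one valid execution order with the standard-library `graphlib` module (Python 3.9+). The task table, the `run_after` and `from_tasks` keys, and the `execution_order` helper are assumptions made for this example; they are not part of fluid or Tekton.

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline tasks. `run_after` mirrors Tekton's `runAfter`
# clause and `from_tasks` mirrors the `from` clause on an input resource.
tasks = {
    "lint-repo": {"run_after": [], "from_tasks": []},
    "build-app": {"run_after": ["lint-repo"], "from_tasks": []},
    "deploy-app": {"run_after": [], "from_tasks": ["build-app"]},
}


def execution_order(tasks):
    """Return one valid execution order for the task DAG."""
    # Each task maps to the set of tasks that must finish before it starts.
    graph = {
        name: set(spec["run_after"]) | set(spec["from_tasks"])
        for name, spec in tasks.items()
    }
    return list(TopologicalSorter(graph).static_order())


order = execution_order(tasks)
print(order)  # ['lint-repo', 'build-app', 'deploy-app']
```

Both kinds of edge are treated the same way for ordering; the difference is that a `from` edge also passes a resource between tasks, while `runAfter` only constrains order. `TopologicalSorter` raises `graphlib.CycleError` if the clauses form a cycle.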

Comment thread doc/design.md

``` yaml
apiVersion: tekton.dev/v1beta1
kind: Pipeline
```
Collaborator

What is the lifecycle of a pipeline object?

Collaborator Author

A `Pipeline` object is like a function definition in Python; its body consists of Pipeline Tasks.
A `PipelineRun` object invokes those Pipeline Tasks in dependency order. Some information about a run can be found in the Task Status.

Comment thread doc/design.md

This will result in the following execution graph:

``` text
```
Collaborator

No whitespace between ``` and text

Comment thread doc/design.md

```diff
-skaffold_image_leeroy_web = fluid.Image(
+skaffold_image_leeroy_web = fluid.image_resource(
```
Collaborator

Similarly, "image" is one of a few pre-defined resource types. How about we keep the name `Image`?

Comment thread doc/design.md

```diff
-goapiVersion: tekton.dev/v1alpha1
+apiVersion: tekton.dev/v1alpha1
```
Collaborator

Thanks for pointing this out!

Comment thread doc/design.md

### Pipeline

A `Pipeline` object is like function declaration, according to the [definition](https://github.com/tektoncd/pipeline/blob/master/docs/pipelines.md).
Collaborator

In the above text, we stated that a Task is like a function. Here we state the same with Pipeline. What is the difference between these two types of "functions"?

Comment thread doc/design.md
``` yaml
value: "spec.template.spec.containers[0].image"
```

The above `Pipeline` is referencing a `Task` called `deploy-using-kubectl` defined as follows:
Collaborator

is referencing => refers to

Comment thread doc/design.md
build_skaffold_web = build_docker_image_from_git_source(source_repo, web_image)

deploy_web = deploy_using_kubectl(source_repo, web_image)
deploy_web.web_image.from(build_skaffold_web)
Collaborator

This `deploy_web.web_image.from` syntax looks confusing. Python programmers do not do this with function parameters.

I am afraid that this weird design might come from the fact that a Pipeline is NOT similar to a function definition.

Collaborator

After reading more about Pipeline, I see it describes a DAG of tasks, where edges are data dependencies between tasks.

Consider the following example from the Tekton tutorial https://github.com/tektoncd/pipeline/blob/master/docs/pipelines.md#from:

- name: build-app
  taskRef:
    name: build-push
  resources:
    outputs:
      - name: image
        resource: my-image
- name: deploy-app
  taskRef:
    name: deploy-kubectl
  resources:
    inputs:
      - name: image
        resource: my-image
        from:
          - build-app

In programming-language terms, this is simply a nested function call:

deploy_kubectl(image=build_push(my_image))

It seems that what we expect users to write is

@fluid.pipeline
def build_and_deploy(my_image):
    deploy_kubectl(image=build_push(my_image))

where `@fluid.pipeline` should dry-run the function `build_and_deploy` to analyze its dependencies, here `deploy_kubectl.image <- build_push`, and generate the YAML definition of the `Pipeline` object.

I am not sure whether the above suggestion is correct, or how reasonable it is. It has been a while since I last used Tekton.

Comment thread doc/design.md
As the following example of `Pipeline spec` comes from Tekton Pipeline tutorials:

``` yaml
- name: lint-repo
```
Collaborator

This doesn't look like a complete Kubernetes YAML. What is its kind?

Comment thread doc/design.md
- `from`: clauses on the PipelineResources needed by a Task.
- `runAfter`: clauses on the Pipeline Tasks.

As the following example of `Pipeline spec` comes from Tekton Pipeline tutorials:
Collaborator

Which tutorial? We need a URL here.

4 participants